AITopics | undetectable backdoor

Collaborating Authors

undetectable backdoor

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Injecting Undetectable Backdoors in Obfuscated Neural Networks and Language Models

Neural Information Processing SystemsMar-19-2026, 04:40:41 GMT

As ML models become increasingly complex and integral to high-stakes domains such as finance and healthcare, they also become more susceptible to sophisticated adversarial attacks. We investigate the threat posed by undetectable backdoors, as defined in Goldwasser et al. [2022], in models developed by insidious external expert firms. When such backdoors exist, they allow the designer of the model to sell information on how to slightly perturb their input to change the outcome of the model. We develop a general strategy to plant backdoors to obfuscated neural networks, that satisfy the security properties of the celebrated notion of indistinguishability obfuscation. Applying obfuscation before releasing neural networks is a strategy that is well motivated to protect sensitive information of the external expert firm. Our method to plant backdoors ensures that even if the weights and architecture of the obfuscated model are accessible, the existence ofthe backdoor is still undetectable. Finally, we introduce the notion of undetectable backdoors to language models and extend our neural network backdoor attacks to such models based on the existence of steganographic functions.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Industry: Information Technology > Security & Privacy (0.97)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.77)

Add feedback

Planting Undetectable Backdoors in Machine Learning Models

Goldwasser, Shafi, Kim, Michael P., Vaikuntanathan, Vinod, Zamir, Or

arXiv.org Artificial IntelligenceNov-9-2024

Given the computational cost and technical expertise required to train machine learning models, users may delegate the task of learning to a service provider. We show how a malicious learner can plant an undetectable backdoor into a classifier. On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation. Importantly, without the appropriate "backdoor key", the mechanism is hidden and cannot be detected by any computationally-bounded observer. We demonstrate two frameworks for planting undetectable backdoors, with incomparable guarantees. First, we show how to plant a backdoor in any model, using digital signature schemes. The construction guarantees that given black-box access to the original model and the backdoored version, it is computationally infeasible to find even a single input where they differ. This property implies that the backdoored model has generalization error comparable with the original model. Second, we demonstrate how to insert undetectable backdoors in models trained using the Random Fourier Features (RFF) learning paradigm or in Random ReLU networks. In this construction, undetectability holds against powerful white-box distinguishers: given a complete description of the network and the training data, no efficient distinguisher can guess whether the model is "clean" or contains a backdoor. Our construction of undetectable backdoors also sheds light on the related issue of robustness to adversarial examples. In particular, our construction can produce a classifier that is indistinguishable from an "adversarially robust" classifier, but where every input has an adversarial example! In summary, the existence of undetectable backdoors represent a significant theoretical roadblock to certifying adversarial robustness.

artificial intelligence, backdoor, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2204.06974

Country:

North America > United States > Maryland > Baltimore (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Quebec > Montreal (0.04)
(11 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.45)

Add feedback

Injecting Undetectable Backdoors in Deep Learning and Language Models

Kalavasis, Alkis, Karbasi, Amin, Oikonomou, Argyris, Sotiraki, Katerina, Velegkas, Grigoris, Zampetakis, Manolis

arXiv.org Machine LearningJun-9-2024

As ML models become increasingly complex and integral to high-stakes domains such as finance and healthcare, they also become more susceptible to sophisticated adversarial attacks. We investigate the threat posed by undetectable backdoors in models developed by insidious external expert firms. When such backdoors exist, they allow the designer of the model to sell information to the users on how to carefully perturb the least significant bits of their input to change the classification outcome to a favorable one. We develop a general strategy to plant a backdoor to neural networks while ensuring that even if the model's weights and architecture are accessible, the existence of the backdoor is still undetectable. To achieve this, we utilize techniques from cryptography such as cryptographic signatures and indistinguishability obfuscation. We further introduce the notion of undetectable backdoors to language models and extend our neural network backdoor attacks to such models based on the existence of steganographic functions.

backdoor, boolean circuit, language model, (15 more...)

arXiv.org Machine Learning

2406.0566

Country:

North America > Canada (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Undetectable backdoors for machine learning models

#artificialintelligenceJul-1-2022, 16:50:34 GMT

We're in the middle of a giant machine learning surge, with ML-based "classifiers" being used to make all kinds of decisions at speeds that humans could never match: ML decides everything from whether you get a bank loan to what your phone's camera judges to be a human face.

undetectable backdoor

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Machine learning has an alarming threat: undetectable backdoors

#artificialintelligenceMay-30-2022, 09:05:16 GMT

This article is part of our coverage of the latest in AI research. If an adversary gives you a machine learning model and secretly plants a malicious backdoor in it, what are the chances that you can discover it? The security of machine learning is becoming increasingly critical as ML models find their way into a growing number of applications. The new study focuses on the security threats of delegating the training and development of machine learning models to third parties and service providers. With the shortage of AI talent and resources, many organizations are outsourcing their machine learning work, using pre-trained models or online ML services.

backdoor, ml model, undetectable backdoor, (15 more...)

#artificialintelligence

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Machine learning has a backdoor problem

#artificialintelligenceMay-23-2022, 20:54:55 GMT

backdoor, digital signature, ml model, (16 more...)

#artificialintelligence

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

AIs could be hacked with undetectable backdoors to make bad decisions

New ScientistMay-3-2022, 14:41:06 GMT

Artificial intelligence is increasingly used in business. But because of the way it is built, there is theoretical potential for the software to contain undetectable features that bypass its normal decision-making process, meaning it could be exploited by malicious third parties. For instance, an AI model tasked with shortlisting CVs for a job vacancy could be made to covertly prioritise any which include a deliberately obscure phrase.

ai model, make bad decision, undetectable backdoor, (1 more...)

New Scientist

Technology: Information Technology > Artificial Intelligence (1.00)

Add feedback

Machine-learning models vulnerable to undetectable backdoors

#artificialintelligenceApr-21-2022, 11:43:02 GMT

Boffins from UC Berkeley, MIT, and the Institute for Advanced Study in the United States have devised techniques to implant undetectable backdoors in machine learning (ML) models. Their work suggests ML models developed by third parties fundamentally cannot be trusted. In a paper that's currently being reviewed – "Planting Undetectable Backdoors in Machine Learning Models" – Shafi Goldwasser, Michael Kim, Vinod Vaikuntanathan, and Or Zamir explain how a malicious individual creating a machine learning classifier – an algorithm that classifies data into categories (eg "spam" or "not spam") – can subvert the classifier in a way that's not evident. "On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation," the paper explains. "Importantly, without the appropriate'backdoor key,' the mechanism is hidden and cannot be detected by any computationally-bounded observer."

backdoor, classifier, undetectable backdoor, (16 more...)

#artificialintelligence

Country: North America > United States (0.25)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback